Contact Prediction is Hardest for the Most Informative Contacts, but Improves with the Incorporation of Contact Potentials

Author: Grigoryan Gevorg
Holland Jack
Pan Qinxin
Publication venue: Dartmouth Digital Commons
Publication date: 28/06/2018

Co-evolution between pairs of residues in a multiple sequence alignment (MSA) of homologous proteins has long been proposed as an indicator of structural contacts. Recently, several methods, such as direct-coupling analysis (DCA) and MetaPSICOV, have been shown to achieve impressive rates of contact prediction by taking advantage of considerable sequence data. In this paper, we show that prediction success rates are highly sensitive to the structural definition of a contact, with more permissive definitions (i.e., those classifying more pairs as true contacts) naturally leading to higher positive predictive rates, but at the expense of the amount of structural information contributed by each contact. Thus, the remaining limitations of contact prediction algorithms are most noticeable in conjunction with geometrically restrictive contacts, precisely those that contribute more information in structure prediction. We suggest that, to improve prediction rates for such “informative” contacts, one could combine co-evolution scores with additional indicators of contact likelihood. Specifically, we find that when a pair of co-varying positions in an MSA is occupied by residue pairs with favorable statistical contact energies, that pair is more likely to represent a true contact. We show that combining a contact potential metric with DCA or MetaPSICOV performs considerably better than DCA or MetaPSICOV alone, respectively. This is true regardless of contact definition, but especially true for stricter and more informative contact definitions. In summary, this work outlines some remaining challenges to be addressed in contact prediction and proposes and validates a promising direction towards improvement.
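
The abstract's point that a more permissive contact definition raises the positive predictive rate can be illustrated with a minimal sketch. The distance-cutoff contact definition, the function name `ppv_top_k`, and the toy numbers below are assumptions made for illustration; the paper's actual contact definitions and scoring are not given in this listing.

```python
import numpy as np

def ppv_top_k(scores, distances, cutoff, k):
    """Fraction of the k highest-scoring residue pairs whose distance is below cutoff."""
    scores = np.asarray(scores, dtype=float)
    distances = np.asarray(distances, dtype=float)
    top = np.argsort(-scores)[:k]  # indices of the k best-scoring pairs
    return float(np.mean(distances[top] < cutoff))

# Toy data (hypothetical): co-evolution scores and inter-residue distances in Å.
scores = [0.9, 0.8, 0.3, 0.1]
distances = [5.0, 9.0, 7.0, 20.0]

# Same predictions, two contact definitions (assumed cutoffs).
strict = ppv_top_k(scores, distances, cutoff=8.0, k=2)
permissive = ppv_top_k(scores, distances, cutoff=10.0, k=2)
```

With identical predictions, loosening the cutoff can only keep or increase the fraction counted as true contacts, which is why reported success rates depend on the definition chosen.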

Dartmouth Digital Commons (Dartmouth College)

Tertiary Alphabet for the Observable Protein Structural Universe

Author: Grigoryan Gevorg
Mackenzie Craig O
Zhou Jianfu
Publication venue: Dartmouth Digital Commons
Publication date: 03/11/2016

Here, we systematically decompose the known protein structural universe into its basic elements, which we dub tertiary structural motifs (TERMs). A TERM is a compact backbone fragment that captures the secondary, tertiary, and quaternary environments around a given residue, comprising one or more disjoint segments (three on average). We seek the set of universal TERMs that capture all structure in the Protein Data Bank (PDB), finding remarkable degeneracy. Only ∼600 TERMs are sufficient to describe 50% of the PDB at sub-Angstrom resolution. However, rarer geometries also exist, and the overall structural coverage grows logarithmically with the number of TERMs. We go on to show that universal TERMs provide an effective mapping between sequence and structure. We demonstrate that TERM-based statistics alone are sufficient to recapitulate close-to-native sequences given either NMR or X-ray backbones. Furthermore, sequence variability predicted from TERM data agrees closely with evolutionary variation. Finally, locations of TERMs in protein chains can be predicted from sequence alone based on sequence signatures emergent from TERM instances in the PDB. For multisegment motifs, this method identifies spatially adjacent fragments that are not contiguous in sequence, a major bottleneck in structure prediction. Although all TERMs recur in diverse proteins, some appear specialized for certain functions, such as interface formation, metal coordination, or even water binding. Structural biology has benefited greatly from previously observed degeneracies in structure. The decomposition of the known structural universe into a finite set of compact TERMs offers exciting opportunities toward better understanding, design, and prediction of protein structure.

PubMed Central

Dartmouth Digital Commons (Dartmouth College)

Computational approaches for the design and prediction of protein-protein interactions

Author: Grigoryan Gevorg, Ph. D. Massachusetts Institute of Technology
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2007

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Biology, 2007. Includes bibliographical references (leaves 167-187). There is a large class of applications in computational structural biology for which atomic-level representation is crucial for understanding the underlying biological phenomena, yet explicit atomic-level modeling is computationally prohibitive. Computational protein design, homology modeling, protein interaction prediction, docking and structure recognition are among these applications. Models that are commonly applied to these problems combine atomic-level representation with assumptions and approximations that make them computationally feasible. In this thesis I focus on several aspects of this type of modeling, analyze its limitations, propose improvements and explore applications to the design and prediction of protein-protein interactions. By Gevorg Grigoryan. Ph.D.

DSpace@MIT

Coarse-graining protein energetics in sequence variables

Author: Amy E. Keating
D. de Fontaine
D. Krylov
Dane Morgan
Fei Zhou
Gerbrand Ceder
Gevorg Grigoryan
R. F. Goldstein
Steve R. Lustig
Publication venue: American Physical Society (APS)
Publication date: 03/10/2005

We show that cluster expansions (CE), previously used to model solid-state materials with binary or ternary configurational disorder, can be extended to the protein design problem. We present a generalized CE framework, in which properties such as energy can be unambiguously expanded in the amino-acid sequence space. The CE coarse grains over nonsequence degrees of freedom (e.g., side-chain conformations) and thereby simplifies the problem of designing proteins, or predicting the compatibility of a sequence with a given structure, by many orders of magnitude. The CE is physically transparent, and can be evaluated through linear regression on the energies of training sequences. We show, as an example, that good prediction accuracy is obtained with up to pairwise interactions for a coiled-coil backbone, and that triplet interactions are important in the energetics of a more globular zinc-finger backbone. Comment: 10 pages, 3 figures
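
As a rough illustration of fitting an expansion "through linear regression on the energies of training sequences", the sketch below fits constant, single-site, and pairwise terms to toy energies by least squares. It is not the authors' formulation: the redundant one-hot basis, the two-letter alphabet, the function names, and `toy_energy` are all assumptions made for illustration.

```python
import itertools
import numpy as np

def ce_features(seq, alphabet):
    """Constant term, one-hot single-site terms, and one-hot site-pair terms."""
    n, k = len(seq), len(alphabet)
    idx = [alphabet.index(a) for a in seq]
    point = np.zeros(n * k)
    for i, a in enumerate(idx):
        point[i * k + a] = 1.0
    pairs = []
    for i, j in itertools.combinations(range(n), 2):
        block = np.zeros(k * k)
        block[idx[i] * k + idx[j]] = 1.0
        pairs.append(block)
    return np.concatenate([[1.0], point] + pairs)

def fit_ce(seqs, energies, alphabet):
    """Least-squares fit of expansion coefficients to training energies."""
    X = np.array([ce_features(s, alphabet) for s in seqs])
    y = np.asarray(energies, dtype=float)
    # The one-hot basis is redundant, so lstsq returns the minimum-norm solution.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def predict_ce(seq, coef, alphabet):
    """Evaluate the fitted expansion for one sequence."""
    return float(ce_features(seq, alphabet) @ coef)

def toy_energy(seq):
    """Hypothetical stand-in for an expensive structure-based energy."""
    return seq.count("A") - 2.0 * (seq[0] == seq[2])

alphabet = "AB"
train = ["".join(p) for p in itertools.product(alphabet, repeat=3)]
coef = fit_ce(train, [toy_energy(s) for s in train], alphabet)
errors = [abs(predict_ce(s, coef, alphabet) - toy_energy(s)) for s in train]
```

Because the toy energy contains only single-site and pairwise contributions, the fitted expansion reproduces it on the training set; a real application would fit on a subset of sequences and validate on held-out ones.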

arXiv.org e-Print Archive

Crossref

CERN Document Server

Ultra-Fast Evaluation of Protein Energies Directly from Sequence

Author: Amy E Keating
Dane Morgan
Diana Murray
Fei Zhou
Gerbrand Ceder
Gevorg Grigoryan
Steve R Lustig
Publication venue: Public Library of Science
Publication date: 01/06/2006

The structure, function, stability, and many other properties of a protein in a fixed environment are fully specified by its sequence, but in a manner that is difficult to discern. We present a general approach for rapidly mapping sequences directly to their energies on a pre-specified rigid backbone, an important sub-problem in computational protein design and in some methods for protein structure prediction. The cluster expansion (CE) method that we employ can, in principle, be extended to model any computable or measurable protein property directly as a function of sequence. Here we show how CE can be applied to the problem of computational protein design, and use it to derive excellent approximations of physical potentials. The approach provides several attractive advantages. First, following a one-time derivation of a CE expansion, the amount of time necessary to evaluate the energy of a sequence adopting a specified backbone conformation is reduced by a factor of 10⁷ compared to standard full-atom methods for the same task. Second, the agreement between two full-atom methods that we tested and their CE sequence-based expressions is very high (root mean square deviation 1.1–4.7 kcal/mol, R² = 0.7–1.0). Third, the functional form of the CE energy expression is such that individual terms of the expansion have clear physical interpretations. We derived expressions for the energies of three classic protein design targets (a coiled coil, a zinc finger, and a WW domain) as functions of sequence, and examined the most significant terms. Single-residue and residue-pair interactions are sufficient to accurately capture the energetics of the dimeric coiled coil, whereas higher-order contributions are important for the two more globular folds. For the task of designing novel zinc-finger sequences, a CE-derived energy function provides significantly better solutions than a standard design protocol, in comparable computation time. Given these advantages, CE is likely to find many uses in computational structural modeling.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Single methyl groups can act as toggle switches to specify transmembrane protein-protein interactions

Author: Cohen Emily B.
DiMaio Daniel
Grigoryan Gevorg
He Li
Heim Erin N.
Kragelund Birthe Brandt
Shelar Ashish
Steinocher Helena
Publication venue: eLife Sciences Publications, Ltd
Publication date: 04/09/2017

Copenhagen University Research Information System

De novo design of a transmembrane Zn²⁺-transporting four-helix bundle.

Author: Acharya Rudresh
Bhate Manasi P
DeGrado William F
Grabe Michael
Grigoryan Gevorg
Hong Mei
Joh Nathan H
Wang Tuo
Wu Yibing
Publication venue: eScholarship, University of California
Publication date: 01/12/2014

The design of functional membrane proteins from first principles represents a grand challenge in chemistry and structural biology. Here, we report the design of a membrane-spanning, four-helical bundle that transports first-row transition metal ions Zn²⁺ and Co²⁺, but not Ca²⁺, across membranes. The conduction path was designed to contain two di-metal binding sites that bind with negative cooperativity. X-ray crystallography and solid-state and solution nuclear magnetic resonance indicate that the overall helical bundle is formed from two tightly interacting pairs of helices, which form individual domains that interact weakly along a more dynamic interface. Vesicle flux experiments show that as Zn²⁺ ions diffuse down their concentration gradients, protons are antiported. These experiments illustrate the feasibility of designing membrane proteins with predefined structural and dynamic properties.

eScholarship - University of California

Structural analysis of cross α-helical nanotubes provides insight into the designability of filamentous peptide nanomaterials

Author: Beltran Leticia C.
Conticello Vincent P.
Egelman Edward H.
Gnewou Ordy
Grigoryan Gevorg
Juneja Puneet
Modlin Charles
Su Zhangli
Wang Fengbin
Xu Chunfu
Publication venue: Dartmouth Digital Commons
Publication date: 01/12/2021

The exquisite structure-function correlations observed in filamentous protein assemblies provide a paradigm for the design of synthetic peptide-based nanomaterials. However, the plasticity of quaternary structure in sequence-space and the lability of helical symmetry present significant challenges to the de novo design and structural analysis of such filaments. Here, we describe a rational approach to design self-assembling peptide nanotubes based on controlling lateral interactions between protofilaments having an unusual cross-α supramolecular architecture. Near-atomic resolution cryo-EM structural analysis of seven designed nanotubes provides insight into the designability of interfaces within these synthetic peptide assemblies and identifies a non-native structural interaction based on a pair of arginine residues. This arginine clasp motif can robustly mediate cohesive interactions between protofilaments within the cross-α nanotubes. The structure of the resultant assemblies can be controlled through the sequence and length of the peptide subunits, which generates synthetic peptide filaments of similar dimensions to flagella and pili.

Dartmouth Digital Commons (Dartmouth College)

Protein-Directed Self-Assembly of a Fullerene Crystal

Author: Acharya Rudresh
DeGrado William F
Grigoryan Gevorg
Kim Kook-Han
Kim Nam Hyeong
Kim Yong Ho
Kim Yong-Tae
Ko Dong-Kyun
Murray Christopher B
Paul Jaydeep
Zhang Shao-Qing
Publication venue: Dartmouth Digital Commons
Publication date: 01/04/2016

Learning to engineer self-assembly would enable the precise organization of molecules by design to create matter with tailored properties. Here we demonstrate that proteins can direct the self-assembly of buckminsterfullerene (C₆₀) into ordered superstructures. A previously engineered tetrameric helical bundle binds C₆₀ in solution, rendering it water soluble. Two tetramers associate with one C₆₀, promoting further organization revealed in a 1.67-Å crystal structure. Fullerene groups occupy periodic lattice sites, sandwiched between two Tyr residues from adjacent tetramers. Strikingly, the assembly exhibits high charge conductance, whereas both the protein-alone crystal and amorphous C₆₀ are electrically insulating. The affinity of C₆₀ for its crystal-binding site is estimated to be in the nanomolar range, with lattices of known protein crystals geometrically compatible with incorporating the motif. Taken together, these findings suggest a new means of organizing fullerene molecules into a rich variety of lattices to generate new properties by design.

PubMed Central

eScholarship - University of California

Dartmouth Digital Commons (Dartmouth College)